Dataloading Followup #3604
Hello, I am getting this bug when using load from disk. It is intermittent and works again after re-launching training.
akristoffersen
left a comment
LGTM! I don't immediately see what would fix the cam_idx metadata from the cameras to show up here, but if you've validated it on your side, it's good with me!
Here, the variable 'batch' refers to the output of our pixel sampler.
- batch is a dict_keys(['image', 'indices'])
- batch['image'] returns a pytorch tensor with shape `torch.Size([4096, 3])`, where 4096 = num_rays_per_batch.
- batch['image'] returns a `torch.Size([4096, 3])` tensor on CPU, where 4096 = num_rays_per_batch.
do we support rgba supervision here?
Yes this does! When an RGBA image is present in the dataset, it gets converted into RGB format in the InputDataset.
Specifically this is what happens:
- `dataloaders.py`'s `RayBatchStream` will call `self.input_dataset.__getitem__`
- `InputDataset`'s `__getitem__()` method calls `self.get_data`
- `get_data` will call `get_image_float32`
- `get_image_float32` has the code for RGBA support
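For readers unfamiliar with the conversion step, here is a minimal sketch of standard alpha compositing, which is one common way an RGBA image is reduced to RGB. It is not a copy of `get_image_float32`; the function name `rgba_to_rgb` and the explicit `background` argument are assumptions made for illustration.

```python
import torch


def rgba_to_rgb(image: torch.Tensor, background: torch.Tensor) -> torch.Tensor:
    """Blend an (H, W, 4) float image in [0, 1] onto a background color.

    Hypothetical helper for illustration only. Returns an (H, W, 3) tensor.
    Images that are already RGB are returned unchanged.
    """
    if image.shape[-1] != 4:
        return image  # already RGB, nothing to composite
    rgb, alpha = image[..., :3], image[..., 3:]
    # Opaque pixels keep their color; transparent pixels take the background.
    return rgb * alpha + background * (1.0 - alpha)
```

With a white background (`torch.ones(3)`), a fully transparent pixel becomes white and a fully opaque pixel keeps its original color.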
@akristoffersen It's a little obscure how the camera metadata gets fixed, but when a worker process sends a PyTorch tensor or a tensor dataclass object (like cameras) to the main process, that tensor or tensor dataclass object has to be on the CPU device. If it is on the GPU device, it will hit CUDA context errors and/or other unpredictable behavior like what @abrahamezzeddine experienced. Here are some links talking about this:
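The CPU-only rule for worker output can be sketched roughly as follows. This is an illustrative toy, not the PR's code: `ToyRayDataset`, `make_loader`, and `to_device` are hypothetical names, and the only library calls used are `torch.utils.data.Dataset` and `DataLoader`. The idea is that everything built inside `__getitem__` stays on the CPU, and the move to the GPU happens in the main process after the batch has been received.

```python
import torch
from torch.utils.data import DataLoader, Dataset


class ToyRayDataset(Dataset):
    """Hypothetical stand-in for a dataset whose items are built in workers."""

    def __init__(self, num_items: int = 8, rays_per_item: int = 16):
        self.num_items = num_items
        self.rays_per_item = rays_per_item

    def __len__(self) -> int:
        return self.num_items

    def __getitem__(self, idx: int) -> dict:
        # Runs inside a worker process when num_workers > 0,
        # so every tensor created here is left on the CPU.
        image = torch.rand(self.rays_per_item, 3)
        cam_idx = torch.full((self.rays_per_item,), idx, dtype=torch.long)
        return {"image": image, "cam_idx": cam_idx}


def make_loader(num_workers: int = 0) -> DataLoader:
    # batch_size=None disables automatic batching; items pass through as-is.
    return DataLoader(ToyRayDataset(), batch_size=None, num_workers=num_workers)


def to_device(batch: dict, device: torch.device) -> dict:
    # Intended to be called in the main process only, after the batch
    # has crossed the process boundary as CPU tensors.
    return {key: value.to(device) for key, value in batch.items()}
```

In a training loop one would iterate `make_loader(num_workers=4)` and call `to_device(batch, torch.device("cuda"))` on each batch, assuming a CUDA device is available.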
This PR is a followup to #3216. Its purpose is to add further documentation, fix bugs, and improve clarity.
Problems and Background
- [...] `.to(self.device)`), these GPU tensors cannot be properly serialized back to the main process! PyTorch attempts to serialize them but fails silently, resulting in zeroed tensors.
- [...] `method_configs.py` file to resolve this.
- [...] `ns-export cameras` can also be resolved with the GPU fixes
Overview of Changes
TODOs